YC companies are challenging the notion that building AI models requires immense resources: with YC's support, startups training or fine-tuning their own foundation models have achieved feats like generating professional-quality music and designing novel proteins within just three months. These 25 startups, leveraging YC's funding and technical resources, have built innovative AI products across a range of fields, demonstrating that significant advances in AI are within reach of small, resourceful teams.
Meta plans to release an initial version of its next-generation Llama 3 large language model within the next month, followed by a number of models with differing capabilities over the course of the year. Llama 3 will be able to answer a wider range of questions than its predecessor, including more controversial topics. Meta has not released details about the model's size, but it is expected to have about 140 billion parameters; the biggest Llama 2 model has 70 billion.
Meta has released 8B and 70B Llama 3 models with dramatically improved performance, particularly in reasoning, context length, and code. A 400B-parameter model is still in training and is expected to be competitive with Claude 3 Opus. These are easily the most powerful open models available.
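For context, here is a minimal sketch of running the released 8B instruct model locally with Hugging Face's transformers library. It assumes the meta-llama/Meta-Llama-3-8B-Instruct checkpoint (gated behind Meta's license on the Hub), a recent transformers version that accepts chat-format input in the text-generation pipeline, and a GPU with sufficient memory.

```python
# Hedged sketch: load Llama 3 8B Instruct and generate a short reply.
# Assumes license acceptance on the Hub and enough GPU memory.
import torch
import transformers

pipe = transformers.pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [{"role": "user", "content": "Summarize what a context window is."}]
outputs = pipe(messages, max_new_tokens=64)
# Recent pipeline versions return the full chat transcript; print the reply.
print(outputs[0]["generated_text"][-1]["content"])
```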
This guide explores how to use open-source embedding models to enhance AI projects. It covers criteria for selecting a model and methods for deploying one effectively, working through practical examples with Sentence Transformers, an open-source library.
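As a taste of the workflow the guide covers, here is a minimal Sentence Transformers sketch: embed a small corpus and rank documents against a query by cosine similarity. The model name is just one common default; any Sentence Transformers checkpoint works.

```python
# Minimal sketch: embed documents and rank them against a query.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

docs = [
    "Embeddings map text to dense vectors.",
    "Paris is the capital of France.",
]
query = "How do text embeddings work?"

doc_emb = model.encode(docs, convert_to_tensor=True)
query_emb = model.encode(query, convert_to_tensor=True)

# Cosine similarity between the query and every document, highest first.
scores = util.cos_sim(query_emb, doc_emb)[0]
for doc, score in sorted(zip(docs, scores.tolist()), key=lambda p: -p[1]):
    print(f"{score:.3f}  {doc}")
```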
Hugging Face is committing $10 million in free shared GPUs to help developers, academics, and startups create new AI technologies, aiming to counter the concentration of AI progress in the hands of a few tech giants.
Elon Musk's AI company, xAI, is advancing its Grok chatbot to support multimodal inputs, allowing users to upload photos and receive text-based answers.
While the barrier to entry for building AI products has dropped, creating something effective beyond a demo remains deceptively difficult. This series of articles distills crucial lessons and methodologies for developing products on top of large language models, gathered over the past year by people building real-world applications on ML systems. It is organized into three sections: tactical, operational, and strategic. This first part dives into the tactical nuts and bolts of working with large language models, sharing best practices and common pitfalls around prompting, setting up retrieval-augmented generation, applying flow engineering, and evaluation and monitoring.
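To make the evaluation point concrete, here is a hedged sketch of the kind of lightweight, assertion-based eval the series advocates. The `call_llm` function is a hypothetical stand-in for whatever model client you use, stubbed here so the harness runs end to end.

```python
# Hedged sketch of lightweight, assertion-based evals for LLM output.
# `call_llm` is a hypothetical stand-in; wire it to your model provider.
def call_llm(prompt: str) -> str:
    return "The cat sat on the mat."  # placeholder so the harness runs

EVAL_CASES = [
    # Each case pairs a prompt with cheap, deterministic checks on the output.
    {
        "prompt": "Summarize: The cat sat on the mat.",
        "must_include": ["cat"],
        "max_words": 20,
    },
]

def run_evals() -> None:
    failures = 0
    for case in EVAL_CASES:
        output = call_llm(case["prompt"])
        ok = all(term in output.lower() for term in case["must_include"])
        ok = ok and len(output.split()) <= case["max_words"]
        if not ok:
            failures += 1
    print(f"{failures} failure(s) in {len(EVAL_CASES)} case(s)")

if __name__ == "__main__":
    run_evals()
```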
Mistral has released Codestral, a 22B-parameter code model with a 32k-token context window. It is a powerful, fast model with broad performance across many programming languages; its weights are open, and it is also available via Mistral's platform.
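A minimal sketch of calling the model through Mistral's hosted platform, assuming the `mistralai` (v1.x) Python client and the "codestral-latest" model identifier; check Mistral's docs for current names.

```python
# Hedged sketch: chat with the code model via Mistral's hosted API.
import os
from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

resp = client.chat.complete(
    model="codestral-latest",  # assumed identifier; verify against the docs
    messages=[
        {"role": "user", "content": "Write a Python function that reverses a string."}
    ],
)
print(resp.choices[0].message.content)
```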
Amazon is enhancing its AI capabilities by hiring top talent from AI startup Adept and licensing its technology.
Anthropic has introduced prompt caching for its Claude models, letting developers cache frequently used context to significantly reduce costs and latency. Early users such as Notion are already seeing faster, more efficient AI-powered features.
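A hedged sketch of the pattern: mark a large, reused system block with `cache_control` so subsequent calls hit the cache. The parameter names follow Anthropic's published examples, but the feature launched in beta, so check the current docs.

```python
# Hedged sketch of Anthropic prompt caching: cache a large system block.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

LONG_CONTEXT = "...full product documentation, pasted once and reused..."

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=512,
    system=[
        {
            "type": "text",
            "text": LONG_CONTEXT,
            # Marks this block as cacheable across subsequent calls.
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": "Where do I configure webhooks?"}],
)
print(response.content[0].text)
```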
French AI startup Mistral has launched Pixtral 12B, a 12-billion-parameter multimodal model capable of processing both images and text. Available via GitHub and Hugging Face, the model can be fine-tuned and used under an Apache 2.0 license. Its release follows Mistral's $645 million funding round and positions the company as a significant player in Europe's AI landscape.
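A hedged sketch of sending an image and a text prompt to the model through Mistral's hosted API, assuming the `mistralai` (v1.x) client, the "pixtral-12b-2409" identifier from the launch, and a placeholder image URL.

```python
# Hedged sketch: image + text prompt to Pixtral via Mistral's hosted API.
import os
from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

resp = client.chat.complete(
    model="pixtral-12b-2409",  # launch identifier; verify against the docs
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this chart in one sentence."},
                # Placeholder URL for illustration only.
                {"type": "image_url", "image_url": "https://example.com/chart.png"},
            ],
        }
    ],
)
print(resp.choices[0].message.content)
```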
Durk Kingma, a co-founder of OpenAI, has announced that he is joining Anthropic. In his announcement on social media, Kingma said he will work primarily remotely from the Netherlands but plans to visit the San Francisco Bay Area regularly. He expressed enthusiasm for joining Anthropic, noting that the company's approach to AI development aligns with his own values, and said he looks forward to collaborating with a talented team that includes former colleagues from OpenAI and Google.

Kingma holds a Ph.D. in machine learning from the University of Amsterdam and has a deep background in AI research. Before co-founding OpenAI, he was a doctoral fellow at Google; at OpenAI, he was a research scientist leading the algorithms team, whose foundational work on generative modeling underpins later systems such as DALL-E 3 and ChatGPT. After leaving OpenAI in 2018, he worked as an angel investor and advisor to AI startups and rejoined Google, contributing to its AI research division, Google Brain, until its merger with DeepMind in 2023.

His hiring is part of a broader trend of Anthropic attracting talent from OpenAI. The company has recently brought on Jan Leike, a former safety lead at OpenAI, and John Schulman, another OpenAI co-founder, and hired Mike Krieger, co-founder of Instagram, as its first head of product. Anthropic's CEO, Dario Amodei, previously served as VP of research at OpenAI and left over differences about the company's direction, particularly its increasing commercial focus. Anthropic aims to distinguish itself by emphasizing safety in AI development, a commitment that resonates with Kingma's professional philosophy.
OpenAI recently held its annual DevDay event in San Francisco, marking a significant shift in how the company engages with developers. This year's event introduced four major API updates designed to deepen the integration of OpenAI's models into applications. Unlike the previous year, which featured a keynote by CEO Sam Altman, the 2024 DevDay took a more global approach, with additional events scheduled in London and Singapore.

The standout feature unveiled at the event is the Realtime API, now in public beta, which enables speech-to-speech conversations using six preset voices. Developers previously had to chain separate models for speech recognition, text processing, and text-to-speech conversion; the Realtime API handles all three through a single API call, streamlining the development of voice assistants. OpenAI also plans to add audio input and output to its Chat Completions API for more versatile interactions.

Two further features aim to help developers optimize performance and reduce costs. "Model distillation" lets developers fine-tune smaller, more affordable models on the outputs of advanced models, potentially improving the relevance and accuracy of the results. "Prompt caching" accelerates inference by remembering frequently used prompts, offering a 50% discount on cached input tokens and faster processing times.

OpenAI also expanded its fine-tuning capabilities to images, a feature it calls "vision fine-tuning," allowing developers to customize the multimodal GPT-4o with both text and images. This opens up applications such as visual search, object detection for autonomous vehicles, and medical image analysis.

The absence of an Altman keynote was notable, especially given the dramatic events surrounding his leadership in the past year; the focus fell instead on the product team and the technology itself. Altman did attend and participated in a closing "fireside chat," reflecting on the changes since the last DevDay, including a dramatic decrease in costs and a substantial increase in token volume across OpenAI's systems. Overall, the 2024 DevDay emphasized OpenAI's commitment to empowering developers with advanced tools while navigating the complexities of its internal dynamics and the broader AI landscape.
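For a sense of the developer experience, here is a hedged sketch of requesting a single text-only response over the Realtime API's WebSocket interface using the `websockets` Python package. The endpoint, beta header, and event names follow OpenAI's launch documentation and may have changed since.

```python
# Hedged sketch: request one text-only response from the Realtime API.
# Endpoint, beta header, and event names follow the launch docs and may change.
import asyncio
import json
import os

import websockets  # pip install websockets

URL = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview-2024-10-01"
HEADERS = {
    "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
    "OpenAI-Beta": "realtime=v1",
}

async def main() -> None:
    # Older websockets releases use `extra_headers`; newer ones name the
    # same keyword `additional_headers`.
    async with websockets.connect(URL, extra_headers=HEADERS) as ws:
        # Ask the session for a single text-only response.
        await ws.send(json.dumps({
            "type": "response.create",
            "response": {"modalities": ["text"], "instructions": "Say hello."},
        }))
        async for raw in ws:
            event = json.loads(raw)
            if event["type"] == "response.text.delta":
                print(event["delta"], end="", flush=True)
            elif event["type"] == "response.done":
                break

asyncio.run(main())
```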
On October 1, 2024, Simon Willison live-blogged OpenAI DevDay in San Francisco, sharing real-time updates and insights from the event. The keynote opened with a review of the new o1 model and applications built on it, and OpenAI announced that the rate limit for o1 was being doubled to 10,000 requests per minute, aligning it with GPT-4.

The event featured several demonstrations of the new Realtime API, which supports voice input and output over WebSockets. It was shown in a travel-agent demo and in an AI assistant that could make phone calls to order food; the Speak language-learning app also used the new API, which was rolling out to developers that day. Pricing for the Realtime API was revealed as well, with separate costs for audio input and output.

Another major announcement was model customization: fine-tuning is now available for vision models, allowing developers to fine-tune with images for applications such as product recommendations and medical imaging. OpenAI also announced that cost per token is now 99% cheaper than two years prior and introduced automatic prompt caching, which provides a 50% discount on previously seen tokens.

A session on structured outputs for reliable applications traced the evolution of the tools mechanism and the importance of structured outputs for applications that connect to external systems: the feature guarantees that output will match a specified JSON schema, addressing earlier reliability problems with JSON responses. A session on model distillation covered creating smaller but powerful models by fine-tuning them on the outputs of larger ones, letting developers scale applications while managing cost and performance. Two new features support this process: stored completions for capturing interactions with models, and a new evaluation tool for assessing model performance.

Afternoon sessions covered building multimodal applications with the new Realtime API, which integrates audio input, processing, and output into a single component, enhancing the user experience. The day concluded with a fireside chat between OpenAI leaders Sam Altman and Kevin Weil on the future of AI, the concept of AGI, and the importance of safety and alignment in AI development; they emphasized an iterative approach to product development and the need for continuous research to push the boundaries of AI capabilities. Overall, the event showcased significant advances in OpenAI's developer tooling, model performance, and output reliability.
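As an illustration of the structured outputs feature described above, here is a hedged sketch using the parse helper in the official `openai` Python SDK (v1.x) to constrain a reply to a Pydantic-defined schema.

```python
# Hedged sketch: constrain a model reply to a JSON schema via Pydantic.
from openai import OpenAI
from pydantic import BaseModel

class Order(BaseModel):
    item: str
    quantity: int

client = OpenAI()  # reads OPENAI_API_KEY from the environment

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[{"role": "user", "content": "I'd like two espressos."}],
    response_format=Order,  # output is guaranteed to match this schema
)
order = completion.choices[0].message.parsed
print(order.item, order.quantity)
```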
Elon Musk recently hosted a recruiting event for his new AI startup, xAI, at the original headquarters of OpenAI in San Francisco. The gathering featured free food, drinks, and live music created by AI, along with heightened security measures, including metal detectors and ID checks. It coincided with OpenAI's annual DevDay, where CEO Sam Altman was discussing the company's significant funding achievements, adding to the competitive atmosphere.

During the event, Musk articulated his vision for xAI: developing digital superintelligence that is as benign as possible, and he invited attendees to join his mission to create useful applications from that intelligence. He expressed his belief that artificial general intelligence (AGI) could be achieved within a couple of years and compared xAI's rapid growth to the SR-71 Blackbird, a high-speed reconnaissance aircraft known for its strategic advantage during the Cold War. He identified xAI, along with OpenAI, Anthropic, and Google, as the key players in the AI landscape for the next five years, aiming for xAI to achieve a level of dominance in AI similar to SpaceX's in the aerospace industry.

Founded in March 2023, xAI has quickly expanded from a small office to a larger space in Palo Alto. Musk has recruited a team from his other companies and brought in experienced researchers from leading tech firms, and the startup secured $6 billion in funding, significantly boosting its valuation and resources. However, xAI's initial product, Grok, has faced challenges, relying on external technologies for core features due to the need for rapid development.

Musk's competitive stance against OpenAI is fueled by a history of conflict, including his departure from the organization and subsequent legal disputes; he has expressed distrust of OpenAI's profit-driven model and aims to create a more open and accessible AI. The recruiting event attracted engineers from rival companies, highlighting Musk's ability to sell his vision and attract talent despite the fierce competition in the AI sector. His approach emphasizes speed and innovation, appealing to those who prefer a less conventional work environment, and he believes a "maximum, truth-seeking AI" is essential for achieving safety in AI development. The event was organized quickly, reflecting Musk's commitment to advancing xAI and his broader ambitions in the tech industry.